Fast Computation of Normalized Edit Distances
Abstract
The Normalized Edit Distance (NED) between two strings X and Y is defined as the minimum quotient between the sum of weights of the edit operations required to transform X into Y and the length of the editing path corresponding to these operations. An algorithm for computing the NED has recently been introduced by Marzal and Vidal that exhibits O(mn^2) computing complexity, where m and n are the lengths of X and Y. We propose here an algorithm that is observed to require in practice the same O(mn) computing resources as the conventional unnormalized Edit Distance algorithm does. The performance of this algorithm is illustrated through computational experiments with synthetic data, as well as with real data consisting of OCR chain-coded strings.
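To make the definition concrete, the following is a minimal sketch of a straightforward length-indexed dynamic program for the NED. It is not the fast algorithm proposed in the paper; it is a cubic-time baseline that tracks, for every path length k, the minimum weight of an editing path with exactly k operations, and then takes the best ratio. The function name, the unit default weights, and the convention that a match counts as a zero-weight substitution contributing to the path length are assumptions made for illustration.

```python
from math import inf


def normalized_edit_distance(x, y, w_ins=1.0, w_del=1.0, w_sub=1.0):
    """Return min over editing paths P of W(P) / L(P).

    Assumptions (not taken from the abstract): constant weights per
    operation type, and a match is a zero-weight substitution that
    still counts as one step of the editing path.
    """
    m, n = len(x), len(y)
    if m == 0 and n == 0:
        return 0.0  # empty path; define the distance as 0

    # prev[i][j]: minimum weight of a path with exactly k-1 operations
    # that transforms x[:i] into y[:j] (inf if no such path exists).
    prev = [[inf] * (n + 1) for _ in range(m + 1)]
    prev[0][0] = 0.0
    best = inf

    # No editing path needs more than m + n operations.
    for k in range(1, m + n + 1):
        cur = [[inf] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            for j in range(n + 1):
                cost = inf
                if i > 0:  # delete x[i-1]
                    cost = min(cost, prev[i - 1][j] + w_del)
                if j > 0:  # insert y[j-1]
                    cost = min(cost, prev[i][j - 1] + w_ins)
                if i > 0 and j > 0:  # substitute or match
                    step = 0.0 if x[i - 1] == y[j - 1] else w_sub
                    cost = min(cost, prev[i - 1][j - 1] + step)
                cur[i][j] = cost
        if cur[m][n] < inf:
            best = min(best, cur[m][n] / k)
        prev = cur
    return best
```

As written, this sketch runs in O(mn(m+n)) time, which is the kind of cubic cost that an O(mn) method avoids. For example, transforming "ab" into "a" can be done by one match and one deletion, a path of weight 1 and length 2, so the normalized distance under unit weights is 0.5.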
Similar resources
Parallel algorithms for fast computation of normalized edit distances
We give work-optimal and polylogarithmic time parallel algorithms for solving the normalized edit distance problem. The normalized edit distance between two strings X and Y with lengths n and m is the minimum quotient of the sum of the costs of edit operations transforming X into Y by the length of the edit path corresponding to those edit operations. Marzal and Vidal proposed a sequential algorith...
Computation of Normalized Edit Distance and Applications
Given two strings X and Y over a finite alphabet, the normalized edit distance between X and Y, d(X, Y), is defined as the minimum of W(P)/L(P), where P is an editing path between X and Y, W(P) is the sum of the weights of the elementary edit operations of P, and L(P) is the number of these operations (length of P). In this paper, it is shown that in general, d(X, Y) canno...
Fast Cyclic Edit Distance Computation with Weighted Edit Costs in Classification
Cyclic edit distances are a good measure of contour shapes dissimilarity. A Branch and Bound algorithm that speeds up the computation of cyclic edit distances with arbitrary weights for the edit operations is presented. The algorithm is modified to work with an external bound that further accelerates the computation when applied to classification problems.
Fast Suboptimal Algorithms for the Computation of Graph Edit Distance
Graph edit distance is one of the most flexible mechanisms for error-tolerant graph matching. Its key advantage is that edit distance is applicable to unconstrained attributed graphs and can be tailored to a wide variety of applications by means of specific edit cost functions. Its computational complexity, however, is exponential in the number of vertices, which means that edit distance is fea...
An Efficient Uniform-cost Normalized Edit Distance Algorithm
A common model for computing the similarity of two strings X and Y of lengths m, and n respectively with m n, is to transform X into Y through a sequence of edit operations which are of three types: insertion, deletion, and substitution of symbols. The model assumes a given weight function which assigns a non-negative real cost to each of these edit operations. The amortized weight for a given ...
Journal: IEEE Trans. Pattern Anal. Mach. Intell.
Volume: 17
Publication date: 1995